Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering

Authors

  • Peng Li
  • Wei Li
  • Zhengyan He
  • Xuguang Wang
  • Ying Cao
  • Jie Zhou
  • Wei Xu
Abstract

While question answering (QA) with neural networks, i.e. neural QA, has achieved promising results in recent years, the lack of a large-scale real-world QA dataset is still a challenge for developing and evaluating neural QA systems. To alleviate this problem, we propose WebQA, a large-scale, human-annotated, real-world QA dataset with more than 42k questions and 556k evidences. Because existing neural QA methods treat QA as either a sequence generation or a classification/ranking problem, they face the challenges of expensive softmax computation, handling unseen answers, or requiring a separate candidate answer generation component. In this work, we cast neural QA as a sequence labeling problem and propose an end-to-end sequence labeling model that overcomes all of these challenges. Experimental results on WebQA show that our model significantly outperforms the baselines, with an F1 score of 74.69% with word-based input, and that performance drops by only 3.72 F1 points with the more challenging character-based input.
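To make the sequence labeling formulation concrete, the sketch below shows one way an answer could be read off per-token labels over an evidence passage. The abstract does not specify the tag scheme, so the B/I/O tags used here are an assumption, and the function name `extract_answers` and the example sentence are hypothetical illustrations rather than the authors' model or data.

```python
from typing import List


def extract_answers(tokens: List[str], labels: List[str]) -> List[str]:
    """Collect each maximal run of B followed by I labels as one answer string.

    Assumes a B/I/O scheme: B starts an answer, I continues it, O is outside.
    An I label with no preceding B is ignored.
    """
    answers: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            # A new answer starts; close any answer still open.
            if current:
                answers.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            # Continue the answer that is currently open.
            current.append(token)
        else:
            # O label (or a stray I): close any open answer.
            if current:
                answers.append(" ".join(current))
            current = []
    if current:
        answers.append(" ".join(current))
    return answers


# Hypothetical example: question "What is the highest mountain on Earth?"
evidence = "Mount Everest is the highest mountain on Earth".split()
labels = ["B", "I", "O", "O", "O", "O", "O", "O"]
print(extract_answers(evidence, labels))
```

Because the answer is copied directly from the evidence tokens, a labeling approach of this kind needs no softmax over an answer vocabulary and no separate candidate generation step, and it can return answers never seen in training, which matches the three challenges named in the abstract.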

Similar resources

Neural Domain Adaptation for Biomedical Question Answering

Factoid question answering (QA) has recently benefited from the development of deep learning (DL) systems. Neural network models outperform traditional approaches in domains where large datasets exist, such as SQuAD (≈ 100,000 questions) for Wikipedia articles. However, these systems have not yet been applied to QA in more specific domains, such as biomedicine, because datasets are generally t...

Neural Question Answering at BioASQ 5B

This paper describes our submission to the 2017 BioASQ challenge. We participated in Task B, Phase B, which is concerned with biomedical question answering (QA). We focus on factoid and list questions, using an extractive QA model; that is, we restrict our system to output substrings of the provided text snippets. At the core of our system, we use FastQA, a state-of-the-art neural QA system. We ex...

Answer Sequence Learning with Neural Networks for Answer Selection in Community Question Answering

In this paper, the answer selection problem in community question answering (CQA) is regarded as an answer sequence labeling task, and a novel approach based on a recurrent architecture is proposed for this problem. Our approach first applies convolutional neural networks (CNNs) to learn the joint representation of each question-answer pair, and then uses the joint representation as input of the...

Hybrid Deep Open-Domain Question Answering

The explosion of textual data on the Internet over the past few years has turned open-domain question answering into a scientifically challenging and commercially appealing research area. In this thesis proposal, I explain both my completed and ongoing experiments with a hybrid deep neural system for answering open-domain questions. This system supplements knowledge graphs with the free-texts search...

Full-Time Supervision based Bidirectional RNN for Factoid Question Answering

Recently, bidirectional recurrent neural network (BRNN) has been widely used for question answering (QA) tasks with promising performance. However, most existing BRNN models extract the information of questions and answers by directly using a pooling operation to generate the representation for loss or similarity calculation. Hence, these existing models don’t put supervision (loss or similarit...

Journal:
  • CoRR

Volume abs/1607.06275

Publication date 2016